In [5]:
import seaborn as sns
%matplotlib inline
In [6]:
tips = sns.load_dataset('tips')
tips.head()
Out[6]:
In [8]:
sns.barplot(x='sex',y='total_bill',data=tips)
Out[8]:
In [10]:
import numpy as np
You can set the estimator to any function that reduces a vector to a scalar, for example the standard deviation:
In [11]:
sns.barplot(x='sex',y='total_bill',data=tips,estimator=np.std)
Out[11]:
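As a rough sketch of what "your own function" could look like, the block below defines a hypothetical reducer named `value_range` (not part of seaborn or NumPy) that collapses a vector to a single number. The sample bills are made up for illustration and are not taken from the tips dataset.

```python
import numpy as np

def value_range(values):
    """Reduce a vector of numbers to one scalar: max minus min."""
    values = np.asarray(values, dtype=float)
    return float(values.max() - values.min())

# Made-up bills, for illustration only (not from the tips dataset)
sample_bills = [10.0, 12.5, 20.0, 31.0]
spread = value_range(sample_bills)
print(spread)
```

Assuming the cells above have been run, the same function could be passed to the plot as `sns.barplot(x='sex',y='total_bill',data=tips,estimator=value_range)`.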
In [13]:
sns.countplot(x='sex',data=tips)
Out[13]:
Box plots and violin plots are used to show the distribution of a quantitative variable across the levels of a categorical variable. A box plot (or box-and-whisker plot) shows the distribution of quantitative data in a way that facilitates comparisons between variables or across levels of a categorical variable. The box shows the quartiles of the dataset while the whiskers extend to show the rest of the distribution, except for points that are determined to be “outliers” using a method that is a function of the inter-quartile range.
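The outlier rule described above can be sketched with plain NumPy. This is an illustration, not seaborn's own code: the helper name `box_stats` is hypothetical, the data are made up, and the factor of 1.5 times the inter-quartile range is the common convention assumed here (it corresponds to the default of the `whis` parameter).

```python
import numpy as np

def box_stats(values, whis=1.5):
    """Quartiles, whisker ends, and outliers under the whis * IQR rule."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    # Fences: points beyond these limits are treated as outliers
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    is_inside = (values >= low_fence) & (values <= high_fence)
    inside = values[is_inside]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers stop at the most extreme points still inside the fences
        "whisker_low": inside.min(),
        "whisker_high": inside.max(),
        "outliers": values[~is_inside],
    }

# Made-up data with one obviously extreme value
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
print(stats)
```

In this made-up sample only the value 50 falls outside the fences, so it would be drawn as an individual point while the upper whisker stops at 9.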
In [22]:
sns.boxplot(x="day", y="total_bill", data=tips,palette='rainbow')
Out[22]:
In [25]:
# Passing the whole DataFrame with orient='h' draws one horizontal box per numeric column
sns.boxplot(data=tips,palette='rainbow',orient='h')
Out[25]:
In [26]:
sns.boxplot(x="day", y="total_bill", hue="smoker",data=tips, palette="coolwarm")
Out[26]:
A violin plot plays a similar role to a box-and-whisker plot. It shows the distribution of quantitative data across several levels of one (or more) categorical variables such that those distributions can be compared. Unlike a box plot, in which all of the plot components correspond to actual datapoints, the violin plot features a kernel density estimation of the underlying distribution.
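To make "kernel density estimation" concrete, here is a minimal hand-rolled sketch of a Gaussian KDE in NumPy. It is not seaborn's implementation: the function name `gaussian_kde_sketch`, the made-up sample, and the use of Scott's rule of thumb for the bandwidth are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth=None):
    """Estimate a density on `grid` by averaging one Gaussian bump per sample."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if bandwidth is None:
        # Scott's rule of thumb (an assumption; other bandwidth rules exist)
        bandwidth = samples.std(ddof=1) * n ** (-1 / 5)
    # Standardized distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (n * bandwidth)

# Made-up bills, for illustration only
samples = np.array([8.0, 10.0, 11.0, 15.0, 16.0, 18.0, 25.0, 40.0])
grid = np.linspace(-40.0, 90.0, 1001)
density = gaussian_kde_sketch(samples, grid)

# A density should integrate to roughly 1 over a wide enough grid
area = float(np.sum(density) * (grid[1] - grid[0]))
print(area)
```

The smooth curve in `density` is the kind of shape a violin mirrors on both sides of its axis, which is why a violin can suggest values where no actual observation exists.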
In [27]:
sns.violinplot(x="day", y="total_bill", data=tips,palette='rainbow')
Out[27]:
In [37]:
sns.violinplot(x="day", y="total_bill", data=tips,hue='sex',palette='Set1')
Out[37]:
In [36]:
sns.violinplot(x="day", y="total_bill", data=tips,hue='sex',split=True,palette='Set1')
Out[36]:
The stripplot will draw a scatterplot where one variable is categorical. A strip plot can be drawn on its own, but it is also a good complement to a box or violin plot in cases where you want to show all observations along with some representation of the underlying distribution.
The swarmplot is similar to stripplot(), but the points are adjusted (only along the categorical axis) so that they don’t overlap. This gives a better representation of the distribution of values, although it does not scale as well to large numbers of observations (both in terms of the ability to show all the points and in terms of the computation needed to arrange them).
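The idea of spreading points along the categorical axis can be sketched as follows. This only illustrates random jitter of the kind a strip plot applies; a swarm plot instead arranges points deterministically so they do not overlap, which is not reproduced here. The data, the jitter width of 0.1, and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up observations: category index (0, 1, 2) and a quantitative value
category_index = np.array([0, 0, 0, 1, 1, 2, 2, 2])
values = np.array([12.0, 12.0, 15.5, 9.0, 9.0, 20.0, 21.0, 21.0])

# Jitter shifts points only along the categorical axis; values stay untouched
jitter_width = 0.1
offsets = rng.uniform(-jitter_width, jitter_width, size=category_index.size)
x_positions = category_index + offsets

for x, y in zip(x_positions, values):
    print(f"x={x:.3f}  y={y}")
```

Identical values within a category (such as the two 12.0 entries) end up at different horizontal positions, so they no longer hide one another.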
In [38]:
sns.stripplot(x="day", y="total_bill", data=tips)
Out[38]:
In [39]:
sns.stripplot(x="day", y="total_bill", data=tips,jitter=True)
Out[39]:
In [42]:
sns.stripplot(x="day", y="total_bill", data=tips,jitter=True,hue='sex',palette='Set1')
Out[42]:
In [43]:
# `split` was renamed to `dodge` for stripplot in newer seaborn releases
sns.stripplot(x="day", y="total_bill", data=tips,jitter=True,hue='sex',palette='Set1',dodge=True)
Out[43]:
In [44]:
sns.swarmplot(x="day", y="total_bill", data=tips)
Out[44]:
In [47]:
# `split` was renamed to `dodge` for swarmplot in newer seaborn releases
sns.swarmplot(x="day", y="total_bill",hue='sex',data=tips, palette="Set1", dodge=True)
Out[47]:
In [61]:
sns.violinplot(x="tip", y="day", data=tips,palette='rainbow')
sns.swarmplot(x="tip", y="day", data=tips,color='black',size=3)
Out[61]:
In [15]:
# factorplot was renamed to catplot in newer seaborn releases
sns.catplot(x='sex',y='total_bill',data=tips,kind='bar')
Out[15]: